[Bugfix] Add `use_cross_encoder` flag to use correct activation in `ClassifierPooler` #20527

DarkLight1337 · 2025-07-06T16:45:16Z

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing test command.
The test results, such as pasting the results comparison before and after, or e2e results
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.

Purpose

The recent change by #19978 broke ClassifierPooler because the activation function selection was based on the now-deprecated "score" task. To work around this issue, I added a use_cross_encoder flag to signal to the pooler which activation function to use. This flag is applied inside the _cross_encoding_score calls.
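As a rough illustration of the caller side described above, the sketch below shows one way a scoring entry point could mark its pooling parameters before handing them to the pooler. `PoolingParamsSketch` and `params_for_cross_encoding` are hypothetical names made up for this example; only the `use_cross_encoder` field name comes from the PR, and the real vLLM classes are not reproduced here.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class PoolingParamsSketch:
    """Hypothetical stand-in for the pooling parameters object."""
    use_cross_encoder: bool = False


def params_for_cross_encoding(
    params: Optional[PoolingParamsSketch] = None,
) -> PoolingParamsSketch:
    """Return a copy of the params with the cross-encoder flag switched on.

    The caller's object is left untouched, so params shared with plain
    classification requests keep their default behaviour.
    """
    base = params if params is not None else PoolingParamsSketch()
    return replace(base, use_cross_encoder=True)


if __name__ == "__main__":
    shared = PoolingParamsSketch()
    scored = params_for_cross_encoding(shared)
    print(shared.use_cross_encoder, scored.use_cross_encoder)
```

Copying rather than mutating is a design choice of this sketch, not something the PR states.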

FIX #20461

cc @noooop @maxdebayser

Test Plan

Test Result

The test failures in #20461 are now fixed.

(Optional) Documentation Update


github-actions · 2025-07-06T16:45:26Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of CI tests to catch errors quickly. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

gemini-code-assist

Summary of Changes

Hello @DarkLight1337, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request provides a critical fix for the ClassifierPooler by introducing a use_cross_encoder flag to explicitly guide the selection of the correct activation function. This resolves a regression that caused incorrect behavior for cross-encoder models and ensures proper functionality for both classification and cross-encoding tasks.

Highlights

Bug Fix & Activation Function Handling: Resolved a regression in ClassifierPooler where the activation function was incorrectly selected due to a reliance on a now-deprecated 'score' task. This ensures the correct activation function is applied for both classification and cross-encoding tasks, fixing test failures reported in #20461.
New use_cross_encoder Flag: Introduced a use_cross_encoder boolean flag within PoolingParams to explicitly signal whether a cross-encoder specific activation function should be used. This flag is set to True during LLM._cross_encoding_score calls.
Dynamic Activation Function Selection: Refactored ClassifierPooler to dynamically select and apply the appropriate activation function (either classification-specific or cross-encoder-specific) based on the new use_cross_encoder flag. This logic now handles batches where different pooling parameters might be present.
Centralized Activation Function Logic: The responsibility for determining the correct activation function has been centralized within ClassifierPooler and vllm/transformers_utils/config.py, removing redundant default_activation_function initializations from individual model implementations like BertModel and RobertaModel.
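The selection logic summarized in the highlights can be sketched roughly as follows. This is a minimal pure-Python illustration, not the code in `vllm/model_executor/layers/pooler.py`: `PoolingParamsSketch`, `apply_activation`, and the choice of softmax for classification and sigmoid for cross-encoding are all assumptions made for the example.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Sequence

Vector = List[float]
Activation = Callable[[Vector], Vector]


@dataclass
class PoolingParamsSketch:
    """Hypothetical stand-in carrying only the flag discussed in this PR."""
    use_cross_encoder: bool = False


def softmax(logits: Vector) -> Vector:
    # Subtract the max for numerical stability before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(logits: Vector) -> Vector:
    return [1.0 / (1.0 + math.exp(-x)) for x in logits]


def apply_activation(
    batch_logits: Sequence[Vector],
    params: Sequence[PoolingParamsSketch],
    classification_act: Activation = softmax,   # assumed default
    cross_encoder_act: Activation = sigmoid,    # assumed default
) -> List[Vector]:
    """Pick the activation per request based on `use_cross_encoder`."""
    if len(batch_logits) != len(params):
        raise ValueError("one PoolingParamsSketch is required per batch row")

    flags = [p.use_cross_encoder for p in params]
    if len(set(flags)) == 1:
        # Uniform batch: a tensor implementation could apply one activation
        # to the whole batch in a single call here.
        act = cross_encoder_act if flags[0] else classification_act
        return [act(row) for row in batch_logits]

    # Mixed batch: fall back to choosing the activation row by row.
    return [
        (cross_encoder_act if flag else classification_act)(row)
        for row, flag in zip(batch_logits, flags)
    ]
```

For example, `apply_activation([[0.0], [0.0, 0.0]], [PoolingParamsSketch(True), PoolingParamsSketch(False)])` takes the mixed-batch path and applies sigmoid to the first row and softmax to the second.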

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page. Here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request introduces the use_cross_encoder flag to refactor the activation function selection in ClassifierPooler. I've identified a critical issue in how mixed batches are handled and a potential regression for multi-label classification models. The suggestions aim to fix these issues to ensure the correctness of the implementation.

vllm/model_executor/layers/pooler.py

vllm/transformers_utils/config.py

Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>

DarkLight1337 · 2025-07-06T16:56:19Z

vllm/model_executor/models/bert.py

@@ -462,9 +460,6 @@ def __init__(self, *, vllm_config: VllmConfig, prefix: str = ""):
        super().__init__()
        config = vllm_config.model_config.hf_config

-        self.default_activation_function = \


This isn't used by the current module, so I removed it.

Isotr0py

Overall LGTM!

Isotr0py · 2025-07-06T17:35:00Z

vllm/model_executor/layers/pooler.py

+        if all(use_cross_encoder == use_cross_encoder_list[0]
+               for use_cross_encoder in use_cross_encoder_list):


Suggested change

-        if all(use_cross_encoder == use_cross_encoder_list[0]
-               for use_cross_encoder in use_cross_encoder_list):
+        if len(set(use_cross_encoder_list)) == 1:

I think we can simplify the condition here.
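To make the trade-off in this suggestion concrete, here is a small standalone comparison of the two conditions. The helper names are hypothetical and exist only for this example. The two forms agree on every non-empty list of booleans, but they differ on an empty list: the `all(...)` form never evaluates `flags[0]` and returns `True`, while the `set` form returns `False`. Whether an empty list can reach this code path in the pooler is not stated in the thread.

```python
from typing import List


def all_same_via_all(flags: List[bool]) -> bool:
    # Original condition: every element equals the first one.
    # For an empty list the generator is empty, so all() returns True
    # without ever indexing flags[0].
    return all(flag == flags[0] for flag in flags)


def all_same_via_set(flags: List[bool]) -> bool:
    # Suggested condition: exactly one distinct value is present.
    return len(set(flags)) == 1


if __name__ == "__main__":
    for sample in ([True], [True, True], [False, False], [True, False], []):
        print(sample, all_same_via_all(sample), all_same_via_set(sample))
```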

noooop · 2025-07-07T00:25:51Z

Very sorry.

When I ran it with V0, it still failed, so a more detailed investigation is needed.

maxdebayser

LGTM


Sync to v0.9.2 + remove libsodium + [fix cachetokeziner](neuralmagic/nm-vllm-ent@1423512)

git log:

```
commit 7b94527 (HEAD -> sync-v0.9.2, nm-fork/sync-v0.9.2)
Merge: 1423512 d07be8a
Author: Selbi Nuryyeva <selbi@redhat.com>
Date:   Fri Jul 11 07:03:51 2025 -0400

    Merge remote-tracking branch 'nm-fork/main' into sync-v0.9.2

commit 1423512
Author: Isotr0py <mozf@mail2.sysu.edu.cn>
Date:   Mon Jun 30 18:16:16 2025 +0800

    disable using CacheTokenizer for transformers >= 4.53.0 fixes vllm-project#20224 addendum to vllm-project#20244

commit d07be8a (nm-fork/main, nm-fork/HEAD)
Merge: bbccdbe 02152ad
Author: Daniele <36171005+dtrifiro@users.noreply.github.com>
Date:   Wed Jul 9 15:18:56 2025 +0200

    Dockerfile*.ubi: remove libsodium (opendatahub-io#245) It's not needed anymore https://issues.redhat.com/browse/INFERENG-848

commit 7dd12da
Merge: bbccdbe a5dd03c
Author: Selbi Nuryyeva <selbi@redhat.com>
Date:   Tue Jul 8 10:08:37 2025 -0400

    Merge branch 'v0.9.2-upstream' into sync-v0.9.2

commit a5dd03c (tag: v0.9.2rc2, tag: v0.9.2, upstream/releases/v0.9.2, v0.9.2-upstream, upstream-v0.9.2)
Author: simon-mo <simon.mo@hey.com>
Date:   Sun Jul 6 14:02:36 2025 -0700

    Revert "[V0 deprecation] Remove V0 CPU/XPU/TPU backends (vllm-project#20412)" This reverts commit e202dd2.

commit c18b3b8
Author: Cyrus Leung <tlleungac@connect.ust.hk>
Date:   Mon Jul 7 05:01:48 2025 +0800

    [Bugfix] Add `use_cross_encoder` flag to use correct activation in `ClassifierPooler` (vllm-project#20527) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>

commit 9528e3a
Author: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Date:   Sun Jul 6 12:44:52 2025 -0700

    [BugFix][Spec Decode] Fix spec token ids in model runner (vllm-project#20530) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>

commit 9fb52e5
Author: Cyrus Leung <tlleungac@connect.ust.hk>
Date:   Mon Jul 7 00:54:36 2025 +0800

    [V1] Support any head size for FlexAttention backend (vllm-project#20467) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
```

Test:
CUDA: https://github.com/neuralmagic/nm-cicd/actions/runs/16218517666
ROCM: https://github.com/neuralmagic/nm-cicd/actions/runs/16218578391

…lassifierPooler` (vllm-project#20527) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com>

[Core] Add use_cross_encoder flag to use correct activation functio…

cff74b0

…n in `ClassifierPooler` Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>

DarkLight1337 requested review from mgoin and Isotr0py July 6, 2025 16:45

DarkLight1337 requested a review from aarnphm as a code owner July 6, 2025 16:45

DarkLight1337 added the ready label (ONLY add when PR is ready to merge/full CI is needed) Jul 6, 2025

DarkLight1337 changed the title ~~[Core] Add use_cross_encoder flag to use correct activation functio…~~ [Core] Add use_cross_encoder flag to use correct activation in ClassifierPooler Jul 6, 2025

gemini-code-assist bot reviewed Jul 6, 2025


mergify bot added the frontend label Jul 6, 2025

gemini-code-assist bot reviewed Jul 6, 2025


vllm/model_executor/layers/pooler.py (outdated, resolved)

vllm/transformers_utils/config.py (outdated, resolved)

DarkLight1337 added 3 commits July 6, 2025 16:49

Update online serving

caa3e87

Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>

Fix activation function

b863f05

Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>

Fix

1567dc3

Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>

DarkLight1337 commented Jul 6, 2025


DarkLight1337 added this to the v0.9.2 milestone Jul 6, 2025

DarkLight1337 changed the title ~~[Core] Add use_cross_encoder flag to use correct activation in ClassifierPooler~~ [Bugfix] Add use_cross_encoder flag to use correct activation in ClassifierPooler Jul 6, 2025

Isotr0py approved these changes Jul 6, 2025


simon-mo merged commit c18b3b8 into vllm-project:main Jul 6, 2025
72 of 74 checks passed

DarkLight1337 deleted the pooler-use-cross-encoder branch July 7, 2025 02:38

noooop mentioned this pull request Jul 7, 2025

[Model] The ForSequenceClassification model should be controlled by override_pooler_config. #20538

Draft

4 tasks

maxdebayser reviewed Jul 7, 2025


huydhn pushed a commit to huydhn/vllm that referenced this pull request Jul 8, 2025

[Bugfix] Add use_cross_encoder flag to use correct activation in `C…

d247799

…lassifierPooler` (vllm-project#20527) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>

Chen-zexi pushed a commit to Chen-zexi/vllm that referenced this pull request Jul 13, 2025

[Bugfix] Add use_cross_encoder flag to use correct activation in `C…

aa80ce3

…lassifierPooler` (vllm-project#20527) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
